Refined Error Bounds for Several Learning Algorithms
Author
Abstract
This article studies the achievable guarantees on the error rates of certain learning algorithms, with particular focus on refining logarithmic factors. Many of the results are based on a general technique for obtaining bounds on the error rates of sample-consistent classifiers with monotonic error regions, in the realizable case. We prove bounds of this type expressed in terms of either the VC dimension or the sample compression size. This general technique also enables us to derive several new bounds on the error rates of general sample-consistent learning algorithms, as well as refined bounds on the label complexity of the CAL active learning algorithm. Additionally, we establish a simple necessary and sufficient condition for the existence of a distribution-free bound on the error rates of all sample-consistent learning rules, converging at a rate inversely proportional to the sample size. We also study learning in the presence of classification noise, deriving a new excess error rate guarantee for general VC classes under Tsybakov’s noise condition, and establishing a simple and general necessary and sufficient condition for the minimax excess risk under bounded noise to converge at a rate inversely proportional to the sample size.
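Since the abstract highlights refined label-complexity bounds for the CAL active learning algorithm, a minimal sketch of the querying rule may help fix ideas. This is only an illustration under assumed names and a toy threshold class; none of the identifiers (cal, Hypothesis, oracle) or the example class come from the paper.

# Minimal sketch of CAL-style disagreement-based active learning, assuming a
# finite hypothesis class given as a list of callables. Illustrative only.
from typing import Callable, List, Tuple

Hypothesis = Callable[[float], int]  # maps an instance to a label in {-1, +1}

def cal(version_space: List[Hypothesis],
        unlabeled_stream: List[float],
        oracle: Callable[[float], int]) -> Tuple[List[Hypothesis], int]:
    """Query the oracle only where the current version space disagrees;
    return the reduced version space and the number of label queries."""
    queries = 0
    for x in unlabeled_stream:
        predictions = {h(x) for h in version_space}
        if len(predictions) > 1:      # x lies in the disagreement region
            y = oracle(x)
            queries += 1
            version_space = [h for h in version_space if h(x) == y]
        # otherwise all surviving hypotheses agree on x, so its label is
        # inferred without spending a query
    return version_space, queries

# Toy usage: threshold classifiers on [0, 1] with a realizable target at 0.5.
thresholds = [i / 100 for i in range(101)]
hypotheses = [lambda x, t=t: 1 if x >= t else -1 for t in thresholds]
target = lambda x: 1 if x >= 0.5 else -1
stream = [i / 200 for i in range(201)]
remaining, n_queries = cal(hypotheses, stream, target)
print(f"remaining hypotheses: {len(remaining)}, label queries: {n_queries}")

The paper's refined bounds concern how quickly the number of such label queries grows; the sketch above only illustrates the mechanism being analyzed, not the analysis itself.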
Similar resources
Analysis of Complexity Bounds for PAC-Learning with Random Sets
Learnability in Valiant’s PAC-learning formalism is reformulated in terms of expected (average) error instead of confidence and error parameters. A finite-domain, random set formalism is introduced to develop algorithm-dependent, distribution-specific analytic error estimates. Two random set theorems for finite concept spaces are presented to facilitate these developments. Analyses are carried o...
Bounding the Generalization Error of Convex Combinations of Classifiers: Balancing the Dimensionality and the Margins
A problem of bounding the generalization error of a classifier f ∈ conv(H), where H is a "base" class of functions (classifiers), is considered. This problem frequently occurs in computer learning, where efficient algorithms for combining simple classifiers into a complex one (such as boosting and bagging) have attracted a lot of attention. Using Talagrand's concentration inequalities for empirical p...
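For orientation, a representative textbook form of the margin-based bound for convex combinations (stated here with a margin parameter θ fixed in advance; the constants, and the sharper bound developed in that paper, differ) is: with probability at least 1 − δ over an i.i.d. sample S of size m, for every f ∈ conv(H),

\[
\Pr\big( y f(x) \le 0 \big) \;\le\; \widehat{\Pr}_S\big( y f(x) \le \theta \big) \;+\; \frac{2}{\theta}\, \mathfrak{R}_m(H) \;+\; \sqrt{\frac{\log(1/\delta)}{2m}},
\]

where \mathfrak{R}_m(H) is the Rademacher complexity of the base class H. The reason such bounds do not degrade with the number of combined classifiers is that conv(H) has the same Rademacher complexity as H.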
Error Bounds for Transductive Learning via Compression and Clustering
This paper is concerned with transductive learning. Although transduction appears to be an easier task than induction, there have not been many provably useful algorithms and bounds for transduction. We present explicit error bounds for transduction and derive a general technique for devising bounds within this setting. The technique is applied to derive error bounds for compression schemes suc...
Bounds in Terms of Rademacher Averages
So far we have seen how to obtain generalization error bounds for learning algorithms that pick a function from a function class of limited capacity or complexity, where the complexity of the class is measured using the growth function or VC dimension in the binary case, and using covering numbers or the fat-shattering dimension in the real-valued case. These complexity measures, however, do not t...
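For reference, the quantity referred to here can be written in a common textbook form (constants differ slightly across references): for a sample S = (x_1, …, x_m),

\[
\widehat{\mathfrak{R}}_S(F) \;=\; \mathbb{E}_{\sigma}\Big[\, \sup_{f \in F} \frac{1}{m} \sum_{i=1}^{m} \sigma_i f(x_i) \Big], \qquad \sigma_1,\dots,\sigma_m \ \text{i.i.d. uniform on } \{-1,+1\},
\]

and for a class F of [0,1]-valued functions, with probability at least 1 − δ over the sample,

\[
\mathbb{E}[f] \;\le\; \frac{1}{m}\sum_{i=1}^{m} f(x_i) \;+\; 2\,\widehat{\mathfrak{R}}_S(F) \;+\; 3\sqrt{\frac{\log(2/\delta)}{2m}} \qquad \text{for all } f \in F.
\]

Unlike the growth function or VC dimension, this quantity depends on the sample (and hence on the underlying distribution), which is what makes the resulting bounds data-dependent.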
Generalization error bounds for learning to rank: Does the length of document lists matter?
We consider the generalization ability of algorithms for learning to rank at a query level, a problem also called subset ranking. Existing generalization error bounds necessarily degrade as the size of the document list associated with a query increases. We show that such a degradation is not intrinsic to the problem. For several loss functions, including the cross-entropy loss used in the well...
Journal: Journal of Machine Learning Research
Volume 17, Issue –
Pages –
Publication date: 2016